Parallel Random Prism: A Computationally Efficient Ensemble Learner for Classification

Authors

Frederic T. Stahl

David May

Max Bramer

Abstract

Classifiers generally tend to overfit when the training data contains noise or missing values. Ensemble learning methods are often used to improve classification accuracy, and most ensemble approaches aim to improve the accuracy of decision trees. However, alternatives to decision trees exist. The recently developed Random Prism ensemble learner aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. Like any ensemble learner, however, Random Prism suffers from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modestly sized datasets may pose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype, and evaluates it empirically on a Hadoop computing cluster.
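The overhead the abstract describes (one replicated training set and one induction run per base classifier) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a hypothetical stand-in base learner (`train_stub_classifier`, which only predicts the majority class rather than inducing Prism rules), a local thread pool in place of a Hadoop cluster, and an unweighted majority vote. All function names are illustrative.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def bootstrap_sample(data, rng):
    """Draw len(data) instances with replacement (one replicated training set)."""
    return [rng.choice(data) for _ in data]


def train_stub_classifier(sample):
    """Hypothetical stand-in base learner: always predicts the sample's majority class.

    A real Random Prism base classifier would induce Prism rules here instead.
    """
    majority = Counter(label for _, label in sample).most_common(1)[0][0]
    return lambda features: majority


def train_one(task):
    """Train a single base classifier on its own bootstrap replicate."""
    data, seed = task
    rng = random.Random(seed)  # separate seed per base classifier
    return train_stub_classifier(bootstrap_sample(data, rng))


def train_ensemble(data, n_classifiers=10, workers=4):
    """Base classifiers do not depend on each other, so they can be trained concurrently."""
    tasks = [(data, seed) for seed in range(n_classifiers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train_one, tasks))


def predict(ensemble, features):
    """Combine the base classifiers' outputs by unweighted majority vote (an assumption)."""
    votes = Counter(clf(features) for clf in ensemble)
    return votes.most_common(1)[0][0]
```

Because each call to `train_one` touches only its own sample, the same structure maps onto distributed workers: the cost grows with the number of base classifiers, but the tasks share no state.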

Related resources

Random Prism: An Alternative to Random Forests

Ensemble learning techniques generate multiple classifiers, so-called base classifiers, whose combined classification results are used to increase the overall classification accuracy. In most ensemble classifiers the base classifiers are based on the Top Down Induction of Decision Trees (TDIDT) approach. However, an alternative approach for the induction of rule based classifiers is th...

Random Prism: a noise-tolerant alternative to Random Forests

Ensemble learning can be used to increase the overall classification accuracy of a classifier by generating multiple base classifiers and combining their classification results. A frequently used family of base classifiers for ensemble learning are decision trees. However, alternative approaches can potentially be used, such as the Prism family of algorithms which also induces classification ru...

A Scalable Expressive Ensemble Learning Using Random Prism: A MapReduce Approach

The induction of classification rules from previously unseen examples is one of the most important data mining tasks in science as well as commercial applications. In order to reduce the influence of noise in the data, ensemble learners are often applied. However, most ensemble learners are based on decision tree classifiers which are affected by noise. The Random Prism classifier has recently ...

Scalable Ensemble Learning and Computationally Efficient Variance Estimation

Improving Classification Accuracy based on Random Forest Model with Uncorrelated High Performing Trees

Random forest can achieve high classification performance through a classification ensemble with a set of decision trees that grow using randomly selected subspaces of data. The performance of an ensemble learner is highly dependent on the accuracy of each component learner and the diversity among these components. In random forest, randomization would cause occurrence of bad trees and may incl...


Publication date: 2012
